# High-performance Inference

Moonshotai Kimi Dev 72B GGUF
Kimi-Dev-72B is a large language model developed by moonshotai. This release is quantized in GGUF format and comes in multiple quantization levels to suit different hardware requirements.
Large Language Model
featherless-ai-quants
290
1
E N V Y Legion V2.1 LLaMa 70B Elarablated V0.8 Hf GGUF
This is a quantized version of Legion-V2.1-LLaMa-70B-Elarablated-v0.8-hf, a LLaMa-70B-based model, produced with llama.cpp and offered in multiple quantization options to accommodate different hardware requirements.
Large Language Model
bartowski
267
1
Nvidia AceReason Nemotron 7B GGUF
Other
AceReason-Nemotron-7B is a large language model based on the Nemotron architecture with 7B parameters, offering multiple quantized versions to accommodate different hardware requirements.
Large Language Model
bartowski
209
2
Nvidia OpenCodeReasoning Nemotron 14B GGUF
Apache-2.0
This is the llama.cpp imatrix-quantized version of the NVIDIA OpenCodeReasoning-Nemotron-14B model, suitable for code reasoning tasks.
Large Language Model, Supports Multiple Languages
bartowski
1,771
2
Deepseek R1 Distill Llama 70B Abliterated Mlx 4Bit
This is a distilled model based on Llama-70B, converted to MLX format via mlx-lm and quantized to 4 bits.
Large Language Model, Transformers
cs2764
358
1
Qwen2.5 Smooth Coder 14B Instruct
Apache-2.0
This model is a merge built on the Qwen2.5-14B architecture. It uses the Model Stock merge method to combine 22 different 14B-parameter models from various sources.
Large Language Model, Transformers
spacematt
38
2
Qwen2.5 Bakeneko 32b Instruct V2
Apache-2.0
An instruction-tuned variant of Qwen2.5 Bakeneko 32B, enhanced with Chat Vector and ORPO optimization for improved instruction following. It performs strongly on the Japanese MT-Bench.
Large Language Model, Transformers, Japanese
rinna
140
6
Meta Llama 3 70B Instruct GGUF
The GGUF-format version of Llama 3 70B Instruct, intended for more efficient local inference.
Large Language Model, Transformers, English
PawanKrd
468
4
ECE TW3 JRGL V5
Apache-2.0
ECE-TW3-JRGL-V5 is a merge of the MoMo-72B-lora-1.8.7-DPO and alpaca-dragon-72b-v1 models, created with mergekit to combine the strengths of both.
Large Language Model, Transformers
paloalma
10.59k
1
Yi 34B Chat
Apache-2.0
Yi-34B-Chat is a bilingual large language model developed by 01.AI. It excels in language understanding, commonsense reasoning, and reading comprehension, and supports both Chinese and English interactions.
Large Language Model, Transformers
01-ai
5,784
350
© 2025 AIbase